# Constant-alpha
In the video below, you will learn about another improvement that you can make to your Monte Carlo control algorithm.
MC Control: Constant-alpha
## Pseudocode
The pseudocode for constant-$\alpha$ GLIE MC Control can be found below.
![](img/screen-shot-2018-05-04-at-2.49.48-pm.png)
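If it helps to see the pseudocode as working code, below is a minimal Python sketch of constant-$\alpha$ MC control. It is not the lesson's official implementation: it assumes a discrete environment with the classic OpenAI Gym interface (where `env.reset()` returns a state and `env.step(action)` returns `(next_state, reward, done, info)`), and it uses an every-visit update together with a slowly decaying $\epsilon$-greedy policy to approximate the GLIE conditions.

```python
import numpy as np
from collections import defaultdict

def generate_episode(env, Q, epsilon, nA):
    """Run one episode with the epsilon-greedy policy derived from Q."""
    episode = []
    state = env.reset()
    while True:
        # Explore with probability epsilon; otherwise act greedily w.r.t. Q.
        if np.random.random() > epsilon:
            action = int(np.argmax(Q[state]))
        else:
            action = np.random.choice(nA)
        next_state, reward, done, _ = env.step(action)
        episode.append((state, action, reward))
        state = next_state
        if done:
            return episode

def mc_control(env, num_episodes, alpha, gamma=1.0,
               eps_start=1.0, eps_decay=0.9999, eps_min=0.05):
    """Constant-alpha MC control with a decaying epsilon (for GLIE)."""
    nA = env.action_space.n
    Q = defaultdict(lambda: np.zeros(nA))
    epsilon = eps_start
    for _ in range(num_episodes):
        epsilon = max(epsilon * eps_decay, eps_min)
        episode = generate_episode(env, Q, epsilon, nA)
        states, actions, rewards = zip(*episode)
        rewards = np.array(rewards)
        for t, (state, action) in enumerate(zip(states, actions)):
            # G_t: the discounted return that follows time step t.
            discounts = gamma ** np.arange(len(rewards) - t)
            G = np.sum(rewards[t:] * discounts)
            # Constant-alpha update: Q <- Q + alpha * (G - Q).
            Q[state][action] += alpha * (G - Q[state][action])
    policy = {state: int(np.argmax(values)) for state, values in Q.items()}
    return policy, Q
```

The line to focus on is the constant-$\alpha$ update inside the inner loop; everything else (episode generation, the $\epsilon$ schedule) is unchanged from the GLIE MC control algorithm you have already seen.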
## Setting the Value of $\alpha$
Recall the update equation that we use to amend the values in the Q-table:

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \big(G_t - Q(S_t, A_t)\big)$$

To examine how to set the value of $\alpha$ in more detail, we will slightly rewrite the equation as follows:

$$Q(S_t, A_t) \leftarrow (1 - \alpha)\, Q(S_t, A_t) + \alpha G_t$$
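To see what this rewritten form implies numerically, here is a tiny illustration (with made-up values, not from the lesson):

```python
def update_q(q_current, sampled_return, alpha):
    """Constant-alpha update, written as (1 - alpha) * Q + alpha * G."""
    return (1 - alpha) * q_current + alpha * sampled_return

# Hypothetical values: current estimate is 5.0, newly sampled return is 10.0.
print(update_q(5.0, 10.0, alpha=0.0))  # 5.0  -- the estimate never moves
print(update_q(5.0, 10.0, alpha=1.0))  # 10.0 -- the estimate becomes the new return
print(update_q(5.0, 10.0, alpha=0.1))  # 5.5  -- a small step toward the return
```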
Watch the video below to hear more about how to set the value of $\alpha$.
Here are some guiding principles that will help you to set the value of $\alpha$ when implementing constant-$\alpha$ MC control:

- You should always set the value of $\alpha$ to a number greater than zero and less than (or equal to) one.
  - If $\alpha = 0$, then the action-value function estimate is never updated by the agent.
  - If $\alpha = 1$, then the final value estimate for each state-action pair is always equal to the last return that was experienced by the agent (after visiting the pair).
- Smaller values for $\alpha$ encourage the agent to consider a longer history of returns when calculating the action-value function estimate. Increasing the value of $\alpha$ ensures that the agent focuses more on the most recently sampled returns.
**Important Note**: When implementing constant-$\alpha$ MC control, you must be careful to not set the value of $\alpha$ too close to 1. This is because very large values can keep the algorithm from converging to the optimal policy $\pi_*$. However, you must also be careful to not set the value of $\alpha$ too low, as this can result in an agent that learns too slowly. The best value of $\alpha$ for your implementation will greatly depend on your environment and is best gauged through trial and error.
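To make this trade-off concrete, here is a small, self-contained experiment (hypothetical data, not from the lesson) that applies the constant-$\alpha$ update to a noisy stream of sampled returns centered on a true value of 10:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical returns: true value 10, corrupted by Gaussian noise.
returns = 10 + rng.normal(scale=2.0, size=1000)

for alpha in (0.001, 0.1, 0.9):
    q = 0.0  # initial action-value estimate
    for g in returns:
        q += alpha * (g - q)  # constant-alpha update
    print(f"alpha={alpha}: final estimate = {q:.2f}")
```

If you run this, you should see the pattern described above: with $\alpha = 0.001$ the estimate is still well below 10 after 1,000 samples (the agent learns too slowly), with $\alpha = 0.9$ the final estimate is dominated by the last few noisy returns, and a moderate value like $\alpha = 0.1$ lands close to the true value.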